Data Model and Query Constructs for Versatile Web Query Languages
As the Semantic Web is gaining momentum, the need for
truly versatile query languages becomes increasingly apparent. A Web
query language is called versatile if it can access data in different
formats (e.g. XML and RDF) within the same query program. Most query languages
are not versatile: they have not been specifically designed to cope
with both worlds, providing a uniform language and common constructs
to query and transform data in various formats. Moreover, most of them
do not provide a flexible data model that is powerful enough to naturally
convey both Semantic Web data formats (especially RDF and
Topic Maps) and XML. This article highlights challenges related to the
data model and language constructs for querying both standard Web
and Semantic Web data with an emphasis on facilitating sophisticated
reasoning. It is shown that Xcerpt’s data model and querying constructs
are particularly well-suited for the Semantic Web, but that some adjustments
of the Xcerpt syntax allow for even more effective and natural
querying of RDF and Topic Maps.
Model Theory and Entailment Rules for RDF Containers, Collections and Reification
An RDF graph is, at its core, just a set of statements consisting of subjects, predicates and objects. Nevertheless, since its inception
practitioners have asked for richer data structures such as containers (for
open lists, sets and bags), collections (for closed lists) and reification (for
quoting and provenance). Though this desire has been addressed in the
RDF primer and RDF Schema specification, these structures are explicitly ignored
in the RDF model theory. In this paper we first formalize the intuitive semantics
(as suggested by the RDF primer, the RDF Schema and RDF semantics specifications) of these compound data structures by two orthogonal
extensions of the RDFS model theory (RDFCC for RDF containers and
collections, and RDFR for RDF reification). Second, we give a set of
entailment rules that is sound and complete for the RDFCC and RDFR
model theories. We show that the complexity of RDFCC and RDFR entailment remains the same as that of simple RDF entailment.
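A minimal sketch of the structures named in the abstract above, assuming an RDF graph is modelled as a Python set of (subject, predicate, object) string tuples. The helper names (`rdf_collection`, `read_collection`, `reify`) and the blank node labels are illustrative assumptions and do not come from the paper; only the `rdf:first`, `rdf:rest`, `rdf:nil`, `rdf:Statement`, `rdf:subject`, `rdf:predicate` and `rdf:object` terms are taken from the RDF vocabulary.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def rdf_collection(items, prefix="_:l"):
    """Return (head, triples) encoding a closed list that ends in rdf:nil."""
    triples = set()
    head = RDF + "nil"
    # Build from the tail so each cell can point at the cell after it.
    for i in reversed(range(len(items))):
        cell = f"{prefix}{i}"  # hypothetical blank node label
        triples.add((cell, RDF + "first", items[i]))
        triples.add((cell, RDF + "rest", head))
        head = cell
    return head, triples


def read_collection(graph, head):
    """Walk rdf:first / rdf:rest from head until rdf:nil is reached."""
    items = []
    while head != RDF + "nil":
        first = [o for s, p, o in graph if s == head and p == RDF + "first"]
        rest = [o for s, p, o in graph if s == head and p == RDF + "rest"]
        items.append(first[0])
        head = rest[0]
    return items


def reify(stmt_node, triple):
    """Describe a triple with the four standard reification statements."""
    s, p, o = triple
    return {
        (stmt_node, RDF + "type", RDF + "Statement"),
        (stmt_node, RDF + "subject", s),
        (stmt_node, RDF + "predicate", p),
        (stmt_node, RDF + "object", o),
    }


head, graph = rdf_collection(["ex:a", "ex:b"])
graph |= reify("_:s", ("ex:a", "ex:knows", "ex:b"))
print(read_collection(graph, head))
```

The list is closed because the final `rdf:rest` points at `rdf:nil`; a container, by contrast, carries no such terminator, which is the open/closed distinction the abstract draws.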
AMaχoS—Abstract Machine for Xcerpt
Web query languages promise convenient and efficient access
to Web data such as XML, RDF, or Topic Maps. Xcerpt is one such Web
query language with strong emphasis on novel high-level constructs for
effective and convenient query authoring, particularly tailored to versatile
access to data in different Web formats such as XML or RDF.
However, so far it lacks an efficient implementation to supplement the
convenient language features. AMaχoS is an abstract machine implementation
for Xcerpt that aims at efficiency and ease of deployment. It
strictly separates compilation and execution of queries: Queries are compiled
once to abstract machine code that consists of (1) a code segment
with instructions for evaluating each rule and (2) a hint segment that
provides the abstract machine with optimization hints derived by the
query compilation. This article summarizes the motivation and principles
behind AMaχoS and discusses how its current architecture realizes
these principles.
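The separation of compilation and execution described above can be pictured with a toy example. This is a sketch under strong assumptions: the instruction name `MATCH_LABEL`, the hint keys, and the functions `compile_rules` and `execute` are all hypothetical and are not taken from AMaχoS. The sketch only shows the shape of the idea, namely that compilation runs once and yields a code segment plus a hint segment that the executor consults on every run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompiledProgram:
    code: tuple   # code segment: one instruction sequence per rule
    hints: dict   # hint segment: optimization hints derived at compile time


def compile_rules(rules):
    """Compile (rule_name, required_labels) pairs once into code plus hints."""
    code = tuple(
        (name, tuple(("MATCH_LABEL", label) for label in labels))
        for name, labels in rules
    )
    hints = {
        # Hypothetical hint: try rules with fewer instructions first.
        "order": [name for name, labels in sorted(rules, key=lambda r: len(r[1]))],
        "label_count": {name: len(labels) for name, labels in rules},
    }
    return CompiledProgram(code, hints)


def execute(program, document_labels):
    """Run the compiled code against a set of labels; return matching rule names."""
    by_name = dict(program.code)
    matched = []
    for name in program.hints["order"]:
        if all(label in document_labels for _, label in by_name[name]):
            matched.append(name)
    return matched


# Compile once, execute against several inputs.
program = compile_rules([("r1", ["book", "title"]), ("r2", ["author"])])
print(execute(program, {"book", "title"}))
print(execute(program, {"book", "title", "author"}))
```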
Data Integration on the (Semantic) Web with Rules and Rich Unification
For the last decade a multitude of new data formats for the World Wide Web
have been developed, and a huge amount of heterogeneous semi-structured data
is flourishing online. With the ever increasing number of documents on the
Web, rules have been identified as the means of choice for reasoning about
this data, transforming and integrating it. Query languages such as SPARQL and rule
languages such as Xcerpt use compound queries that are matched or unified with
semi-structured data. This notion of unification is different from the one
that is known from logic programming engines in that (i) it provides
constructs that allow queries to be incomplete in several ways, (ii)
variables may have different types, (iii) it results in sets of
substitutions for the variables in the query instead of a single substitution,
and (iv) subsumption between queries is much harder to decide than in
logic programming.
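Two of the differences listed above, incomplete queries and sets of substitutions, can be illustrated with a toy matcher. It is a deliberately simplified sketch and not Xcerpt's simulation unification: terms are assumed to be `(label, children)` tuples or plain strings, variables are strings starting with `?`, and the function names are hypothetical. A query only has to mention some of the children of a data term, and distinct query children must match distinct data children.

```python
def match(query, data, subst=None):
    """Yield every substitution (a dict) under which query matches data."""
    subst = {} if subst is None else subst
    if isinstance(query, str) and query.startswith("?"):
        # A variable binds to the data term, or must agree with an earlier binding.
        if query in subst:
            if subst[query] == data:
                yield subst
        else:
            yield {**subst, query: data}
    elif isinstance(query, str) or isinstance(data, str):
        # Text leaves match only identical text.
        if query == data:
            yield subst
    else:
        (qlabel, qkids), (dlabel, dkids) = query, data
        if qlabel == dlabel:
            yield from match_children(qkids, dkids, subst, frozenset())


def match_children(qkids, dkids, subst, used):
    """Match each query child against a distinct data child, in any order."""
    if not qkids:
        yield subst  # remaining data children are simply ignored
        return
    first, rest = qkids[0], qkids[1:]
    for i, kid in enumerate(dkids):
        if i in used:
            continue
        for s in match(first, kid, subst):
            yield from match_children(rest, dkids, s, used | {i})


data = ("bib", [
    ("book", [("title", ["A"]), ("author", ["X"])]),
    ("book", [("title", ["B"])]),
])
query = ("bib", [("book", [("title", ["?T"])])])

# One query, several answers: a set of substitutions rather than a single one.
print(list(match(query, data)))
```

The query names neither the `author` child nor the second `book`, yet it still matches, and it produces one substitution per matching book.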
This thesis abstracts from Xcerpt query term simulation, SPARQL graph pattern
matching and XPath XML document matching, and shows that all of them can be
considered as a form of rich unification. Given a set of mappings between
substitution sets of different languages, this abstraction opens up the
possibility for format-versatile querying, i.e. combination of queries in
different formats, or transformation of one format into another format within
a single rule.
To show the superiority of this approach, this thesis introduces an extension
of Xcerpt called Xcrdf, and describes use-cases for the combined querying
and integration of RDF and XML data. With XML being the predominant Web
format, and RDF the predominant Semantic Web format, Xcrdf extends Xcerpt
by a set of RDF query terms and construct terms, including query primitives
for RDF containers, collections and reifications. Moreover, Xcrdf includes
an RDF path query language called RPL that is more expressive than previously
proposed polynomial-time RDF path query languages, but can still be evaluated
in polynomial time combined complexity.
Besides the introduction of this framework for data integration based on rich
unification, this thesis extends the theoretical knowledge about Xcerpt in
several ways: We show that Xcerpt simulation unification is decidable, and
give complexity bounds for subsumption in several fragments of Xcerpt query
terms. The proof is based on a set of subsumption monotone query term
transformations, and is only feasible because of the injectivity requirement
on subterms of Xcerpt queries. The proof gives rise to an algorithm for
deciding Xcerpt query term simulation. Moreover, we give a semantics to
locally and weakly stratified Xcerpt programs; this semantics is
applicable not only to Xcerpt, but to any rule language with rich unification,
including multi-rule SPARQL programs. Finally, we show how Xcerpt grouping
stratification can be reduced to Xcerpt negation stratification, thereby also
introducing the notion of local grouping stratification and weak grouping
stratification.
Taming Existence in RDF Querying
We introduce the recursive, rule-based RDF query language
RDFLog. RDFLog extends previous RDF query languages by arbitrary
quantifier alternation: blank nodes may occur in the scope of all, some,
or none of the universal variables of a rule. In addition, RDFLog is aware
of important RDF features such as the distinction between blank nodes,
literals and URIs or the RDFS vocabulary. The semantics of RDFLog is
closed (every answer is an RDF graph), but lifts RDF’s restrictions on
literal and blank node occurrences for intermediary data. We show how
to define a sound and complete operational semantics that can be implemented
using existing logic programming techniques. Using RDFLog
we classify previous approaches to RDF querying along their support for
blank node construction and show equivalence between languages with
full quantifier alternation and languages with only ∀∃ rules.
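One way to picture a blank node that lies in the scope of all, some, or none of a rule's universal variables is to give it a Skolem-style label built from exactly the variables in its scope. The sketch below is an illustration under that assumption and does not reproduce RDFLog's syntax or its operational semantics; the function names, the `ex:` predicates and the sample bindings are hypothetical.

```python
def blank_node(name, scope, binding):
    """One blank node label per distinct binding of the variables in scope."""
    key = ",".join(f"{v}={binding[v]}" for v in sorted(scope))
    return f"_:{name}({key})"


def apply_rule(bindings, scope):
    """Hypothetical rule head: (x, ex:memberOf, b), (b, ex:covers, y), with blank node b."""
    out = set()
    for binding in bindings:
        node = blank_node("b", scope, binding)
        out.add((binding["x"], "ex:memberOf", node))
        out.add((node, "ex:covers", binding["y"]))
    return out


def blank_nodes(graph):
    """Collect the blank node labels that occur as subject or object."""
    return {t for s, _, o in graph for t in (s, o) if t.startswith("_:")}


# Bindings of the universal variables x and y produced by some rule body.
bindings = [
    {"x": "ann", "y": "t1"},
    {"x": "ann", "y": "t2"},
    {"x": "bob", "y": "t1"},
]

for scope in (set(), {"x"}, {"x", "y"}):
    graph = apply_rule(bindings, scope)
    print(sorted(scope), len(blank_nodes(graph)))
```

With an empty scope every binding shares a single blank node, with scope `{x}` there is one blank node per value of `x`, and with scope `{x, y}` there is one per binding.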
Effective and Efficient Data Access in the Versatile Web Query Language Xcerpt
Access to Web data has become an integral part of many applications
and services. In the past, such data has usually been accessed
through human-tailored HTML interfaces. Nowadays, rich client interfaces
in desktop applications or, increasingly, in browser-based clients ease data
access and allow more complex client processing based on XML or RDF
data retrieved through Web service interfaces. Convenient specifications of
the data processing on the client and flexible, expressive service interfaces
for data access become essential in this context. Web query languages such
as XQuery, XSLT, SPARQL, or Xcerpt have been tailored specifically for
such a setting: declarative and efficient access and processing of Web data.
Xcerpt stands apart among these languages by its versatility, i.e., its ability
to access not just one Web format but many. In this demonstration, two aspects
of Xcerpt are illustrated in detail: The first part of the demonstration
focuses on Xcerpt’s pattern matching constructs and rules to enable effective
and versatile data access. It uses a concrete practical use case from
bibliography management to illustrate these language features. Xcerpt’s
visual companion language visXcerpt is used to provide an intuitive interface
to both data and queries. The second part of the demonstration shows
recent advancements in Xcerpt’s implementation focusing on experimental
evaluation of recent complexity results and optimization techniques, as
well as scalability over a number of usage scenarios and input sizes.
Foundations of Rule-Based Query Answering
This survey article introduces the essential concepts and methods underlying rule-based query languages. It covers four complementary areas: declarative semantics based on adaptations of mathematical logic, operational semantics, complexity and expressive power, and optimisation of query evaluation.
The treatment of these areas is foundation-oriented, the foundations having resulted from over four decades of research in the logic programming and database communities on combinations of query languages and rules. These results have later formed the basis for conceiving, improving, and implementing several Web and Semantic Web technologies, in particular query languages such as XQuery or SPARQL for querying relational, XML, and RDF data, and rule languages like the “Rule Interchange Format (RIF)” currently being developed in a working group of the W3C.
Coverage of the article is deliberately limited to declarative languages in a classical setting: issues such as query answering in F-Logic or in description logics, or the relationship of query answering to reactive rules and events, are not addressed.